Topics and trends in 50 years of fertility transition research: A comparative LDA analysis of modern and historical demography texts, 1964-2014

Johan Junkka

2016-11-13

Introduction

State of the field

“islands of knowledge in a large archipelago” - van de Kaa (1996)

which in the future is unlikely to result in any cohesive story

Aim

  1. provide a more comprehensive quantitative analysis of the main topics discussed across 50 years of research
  2. compare the topic distributions of modern and historical demography

Overview

  1. What is LDA analysis
  2. Data and sample
  3. Topics and examples
  4. Topic patterns
  5. Differences between modern and historical

Latent Dirichlet Allocation

LDA

  • Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
  • Machine learning algorithm that finds hidden patterns in text
  • Clusters of co-occurring words, which are interpreted as topics
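
To make "clusters of co-occurring words" concrete, here is a minimal sketch of LDA fitted with a collapsed Gibbs sampler. This is an assumption made for brevity: Blei et al. (2003) use variational inference, and the slides do not say which implementation was used. The function name `fit_lda`, the toy corpus and the values of `alpha` and `beta` are illustrative only.

```python
import random
from collections import Counter

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=1):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # tokens assigned to topic k
    assign = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # p(topic j | rest) is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta) / (topic size + V * beta)
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + n_vocab * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic-word distributions p(w | topic k).
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + n_vocab * beta) for w in vocab}
        for k in range(n_topics)
    ]
    return phi, doc_topic

# Hypothetical stemmed mini-corpus, not taken from the study data.
docs = [
    "son daughter sex prefer son girl".split(),
    "sex ratio son boy daughter".split(),
    "mortal infant death child mortal".split(),
    "infant mortal surviv death mother".split(),
]
phi, doc_topic = fit_lda(docs, n_topics=2)
for k, dist in enumerate(phi):
    print(k, sorted(dist, key=dist.get, reverse=True)[:3])
```

Each entry of `phi` is a probability distribution over the vocabulary; the highest-probability words in each one are what gets read as a topic.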

Example

LDA example

From Blei (2012)

Implementations of LDA

Data and sample

source n prop
ebsco 1963 0.24
jstor 4725 0.57
wos 1653 0.20
TOTAL 8341 1.00
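
As a quick check of the table, the shares can be recomputed from the counts. A small sketch using only the numbers shown above; because each share is rounded to two decimals, the three rounded values need not add up to exactly 1.00.

```python
# Article counts per source, as listed in the table.
counts = {"ebsco": 1963, "jstor": 4725, "wos": 1653}

total = sum(counts.values())
# Share of each source, rounded to two decimals as in the table.
props = {source: round(n / total, 2) for source, n in counts.items()}

print(total)
print(props)
```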

Query

fertility* AND 
  (
    "demographic transition*" OR 
    "fertility decline*" OR 
    "fertility transition*"
  )
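
The logic of the query can be sketched as a local filter. This is an approximation under stated assumptions: it treats `*` as "any word characters may follow", while the exact wildcard and phrase semantics differ between EBSCO, JSTOR and Web of Science. The function name `matches_query` is hypothetical.

```python
import re

# fertility* : any word that starts with "fertility".
FERTILITY = re.compile(r"\bfertility\w*", re.IGNORECASE)
# Quoted phrases, with the trailing wildcard applied to the last word.
PHRASES = re.compile(
    r"\b(?:demographic transition|fertility decline|fertility transition)\w*",
    re.IGNORECASE,
)

def matches_query(text):
    """True if the text satisfies: fertility* AND (one of the three phrases)."""
    return bool(FERTILITY.search(text)) and bool(PHRASES.search(text))

print(matches_query("Fertility declines in rural Kenya"))
print(matches_query("The demographic transition in Europe"))
```

Note that the two fertility phrases already contain "fertility", so the AND clause only adds a restriction for "demographic transition*".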

Wide net, filtered in post-processing

Articles mentioning fertility transition 1964-2014

Identify historical demography

Using

  • years between 1500-1939
  • 17th, 18th or 19th century
  • “reconstitution”
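
The three criteria above could be applied with regular expressions along the following lines. This is a sketch, not the rule set actually used: the spelled-out century names, the choice to flag a text on any single match, and the function name `is_historical` are assumptions.

```python
import re

# Candidate four-digit years from 1500 to 1999; the range check narrows them to 1500-1939.
YEAR = re.compile(r"\b(1[5-9]\d{2})\b")
# "17th century", "nineteenth-century", "18th centuries", ...
CENTURY = re.compile(
    r"\b(?:17th|18th|19th|seventeenth|eighteenth|nineteenth)[\s-]+centur(?:y|ies)\b",
    re.IGNORECASE,
)
RECONSTITUTION = re.compile(r"reconstitution", re.IGNORECASE)

def is_historical(text):
    """Flag a title or abstract as historical demography if any criterion matches."""
    if any(1500 <= int(year) <= 1939 for year in YEAR.findall(text)):
        return True
    return bool(CENTURY.search(text) or RECONSTITUTION.search(text))

print(is_historical("couples married between 1815 and 1899"))
print(is_historical("survey data from 1990 to 2005"))
```

A rule like this is deliberately broad: any four-digit number in the range, such as a sample size, would also trigger it.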

Top 20 journals for historical and modern articles

Data and sample

Drop words

soil, plant, vaccin, sperm, hormon, animal, testoster, eugen, suzhi, itravagin, cancer, mri, headach, depress, donor, donat, patient, epelepsi, migrain, rubber

Language detection

  • Split abstracts by “//” token
  • Classify text according to language n-gram profiles
  • Use only English text
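
The three steps above can be sketched with character trigram profiles and a rank-based "out-of-place" distance, a common way to classify text by n-gram profile. The slides do not name the tool that was used, so this is an illustration only; the two one-sentence training samples and all function names are hypothetical, and a real classifier would use far larger profiles.

```python
from collections import Counter

def ngram_profile(text, n=3, top_k=300):
    """Map the top_k most frequent character n-grams to their rank (0 = most frequent)."""
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))[:top_k]
    return {gram: rank for rank, gram in enumerate(ranked)}

def out_of_place(doc_profile, lang_profile, penalty):
    """Sum of rank differences; n-grams absent from the language profile cost `penalty`."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else penalty
        for gram, rank in doc_profile.items()
    )

def detect_language(text, profiles, top_k=300):
    """Return the language whose profile is closest to the text's profile."""
    doc = ngram_profile(text, top_k=top_k)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang], top_k))

def english_segments(abstract, profiles):
    """Split an abstract on the "//" token and keep only segments classified as English."""
    segments = [s.strip() for s in abstract.split("//") if s.strip()]
    return [s for s in segments if detect_language(s, profiles) == "en"]

# Hypothetical training samples, one sentence per language.
SAMPLES = {
    "en": "the decline of fertility among married women in the villages during the nineteenth century",
    "fr": "la baisse de la fécondité des femmes mariées dans les villages au cours du dix-neuvième siècle",
}
PROFILES = {lang: ngram_profile(text) for lang, text in SAMPLES.items()}

abstract = SAMPLES["en"] + " // " + SAMPLES["fr"]
print(english_segments(abstract, PROFILES))
```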

Making tokens

Stop words

  • English stop words: “on”, “the”, “and” …

  • custom: “elsevier”, “reserved”, “ltd”, “rights”, “copyright”, “published”, “inc”, “journals”, “fertility”, “decline”, “transition”

Stemming

  • developing = develop
  • families = famili
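
The token pipeline can be sketched as lowercasing, stop-word removal and stemming. Several things here are assumptions: the order of the steps, the tiny English stop-word subset, and `toy_stem`, which is a rough suffix stripper written only to reproduce the two examples above and is not the stemmer used in the analysis.

```python
import re

STOP_WORDS = {"on", "the", "and", "of", "in", "a", "to"}  # tiny illustrative subset
CUSTOM_STOP_WORDS = {
    "elsevier", "reserved", "ltd", "rights", "copyright", "published",
    "inc", "journals", "fertility", "decline", "transition",
}

def toy_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "i"   # families -> famili
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]         # developing -> develop
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]         # rates -> rate
    return word

def make_tokens(text):
    """Lowercase, keep alphabetic words, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    kept = [w for w in words if w not in STOP_WORDS | CUSTOM_STOP_WORDS]
    return [toy_stem(w) for w in kept]

print(make_tokens("The fertility transition and developing families"))
```

Removing "fertility", "decline" and "transition" makes sense because the search query guarantees that they occur in almost every document, so they carry no information for separating topics.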

Frequent words

Results

Top five relevant words by topic
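
If "relevant" here refers to the relevance measure used by LDAvis-style visualisations (an assumption; the slides do not define it), words are ranked by a weighted mix of their probability within the topic and their lift over the corpus-wide probability: relevance = λ·log p(w|t) + (1 − λ)·log(p(w|t) / p(w)). The sketch below uses hypothetical probabilities and an illustrative λ of 0.6.

```python
import math

def top_relevant_words(phi_topic, p_word, lam=0.6, n=5):
    """Rank words by lam * log p(w|t) + (1 - lam) * log(p(w|t) / p(w)).

    phi_topic: p(w | topic) for one topic; p_word: marginal p(w) in the corpus.
    """
    def relevance(w):
        return lam * math.log(phi_topic[w]) + (1 - lam) * math.log(phi_topic[w] / p_word[w])
    return sorted(phi_topic, key=relevance, reverse=True)[:n]

# Hypothetical numbers: "women" is frequent in this topic but also everywhere else.
phi_topic = {"women": 0.30, "sex": 0.25, "ratio": 0.20, "son": 0.15, "studi": 0.10}
p_word = {"women": 0.20, "sex": 0.02, "ratio": 0.02, "son": 0.02, "studi": 0.15}

print(top_relevant_words(phi_topic, p_word, lam=0.6, n=3))
print(top_relevant_words(phi_topic, p_word, lam=1.0, n=2))
```

With λ = 1 the ranking is by within-topic probability alone, so common words dominate; lowering λ pushes words that are distinctive for the topic to the top.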

Example

Sandström, G; Vikström, L (2015) Sex preference for children in German villages during the fertility transition, Population Studies: A Journal of Demography

n Topic24 Topic32 Topic35 Topic37
1 sex desir centuri birth
2 ratio prefer nineteenth interv
3 son famili england cohort
4 prefer children marit pariti
5 femal husband industri childbear
6 male intent twentieth time
7 girl coupl class effect
8 daughter size marriag space
9 gender child control age
10 boy ideal histor born

In the past, parents’ sex preferences for their children have proved difficult to verify. This study used John Knodel’s German village genealogies of couples married between 1815 and 1899 to investigate sex preferences for children during the fertility transition. Event history analyses of couples’ propensity to progress to a fifth parity was used to test whether the probability of having additional children was influenced by the sex composition of surviving children. It appears that son preference influenced reproductive behaviour: couples having only girls experienced significantly higher transition rates than those having only boys or a mixed sibset. However, couples who married after about 1870 began to exhibit fertility behaviour consistent with the choice to have at least one surviving boy and girl. This result represents a surprisingly early move towards the symmetrical sex preference typical of modern European populations.

Topic themes

Distinct topics

n Topic24 Topic27 Topic34 Topic35 Topic6
1 sex neolith project centuri reproduct
2 ratio archaeolog forecast nineteenth evolutionari
3 son forag scenario england offspr
4 prefer island uncertainti marit human
5 femal hunter assumpt industri fit
6 male prehistor futur twentieth wealth
7 girl settlement model class success
8 daughter gather simul marriag life
9 gender popul predict control histori
10 boy der bayesian histor evolut

Method topics

n Topic16 Topic37 Topic39 Topic40 Topic52
1 model birth women region estim
2 variabl interv regress spatial data
3 effect cohort factor pattern census
4 estim pariti logist geograph survey
5 determin childbear analysi variat measur
6 analysi time odd brazil period
7 regress effect socioeconom differ adjust
8 equat space age level method
9 data age status demograph tempo
10 result born conclus diffus registr

Theoretical topic

n Topic1
1 theori
2 research
3 demograph
4 demographi
5 scienc
6 approach
7 review
8 sociolog
9 theoret
10 studi

Geographic topics

n Topic10 Topic13 Topic15 Topic22 Topic4 Topic41 Topic47 Topic48 Topic53
1 europ africa egypt china india latin japan australia itali
2 european saharan vietnam chines kerala asia soviet babi netherland
3 countri hiv malaysia provinc indian countri russia boom spain
4 germani african malay polici pradesh america iran canada italian
5 western aid singapor rural cast korea japanes war dutch
6 eastern south refuge child lanka indonesia turkey australian spanish
7 german zimbabw peninsular beij tamil east republ quebec home
8 poland uganda malaysian counti sri asian russian canadian memori
9 postpon malawi ethnic urban bengal colombia czech franc des
10 hungari epidem egyptian hebei district costa romania zealand leav

Demographic

n Topic11 Topic20 Topic28 Topic38 Topic44 Topic51 Topic54
1 model age marriag migrat rate mortal mortal
2 popul popul cohabit urban age infant diseas
3 age demograph union migrant marriag matern nutrit
4 rate elder marri rural marri child health
5 stabl polici marit citi birth death life
6 momentum increas format intern women surviv expect
7 distribut futur divorc destin woman mother epidemiolog
8 paramet consequ nonmarit emigr proport risk height
9 specif rate dissolut mexico increas birth adult
10 life dividend premarit mobil trend matlab childhood

Economic topics

Three levels: regional, individual and global, plus environment

n Topic21 Topic30 Topic31 Topic36 Topic7
1 incom agricultur labor environment growth
2 countri land employ popul capit
3 develop farm labour resourc model
4 inequ household market environ invest
5 nation rural forc sustain incom
6 cross farmer particip global endogen
7 econom crop wage climat human
8 rate product worker growth economi
9 hypothesi peasant job energi equilibrium
10 effect forest suppli degrad econom

Policy

n Topic29 Topic45 Topic9
1 secur polici health
2 pension plan servic
3 reform program care
4 tax famili medic
5 poverti programm access
6 welfar govern clinic
7 benefit popul facil
8 transfer develop public
9 incom implement provid
10 cost control communiti

Ability

n Topic18 Topic2 Topic33 Topic5
1 abort condom contracept infertil
2 pregnanc sexual method treatment
3 unintend hiv plan art
4 induc rubber women mental
5 women manufactur steril psycholog
6 legal sex famili stress
7 unwant prevent unmet medic
8 contracept partner modern ivf
9 pregnant risk manufactur distress
10 law infect pill cope

Perceptions

n Topic32 Topic55
1 desir interview
2 prefer qualit
3 famili women
4 children decis
5 husband research
6 intent social
7 coupl reproduct
8 size depth
9 child experi
10 ideal attitud

Social structures

n Topic14 Topic25 Topic3 Topic49
1 gender social polit immigr
2 women network social white
3 autonomi individu movement black
4 status communiti class american
5 power behavior discours ethnic
6 empower influenc global hispan
7 bangladesh neighborhood feminist unit
8 decis nepal cultur racial
9 equiti interact democrat puerto
10 household effect articl nativ

Family

n Topic12 Topic17
1 parent famili
2 children chang
3 famili modern
4 household societi
5 child cultur
6 mother social
7 live econom
8 sibl valu
9 intergener tradit
10 father institut

Over time

Distinct topics

Method topics

Geographic topics

Demographic

Economic topics

Three levels: regional, individual and global, plus environment

Policy

Ability

Perceptions

Social structures

Family

Topic patterns

Intertopic distance map

Interactive version: topic-vis

Frequent words by topic
1 2 3 4 5
educ countri contracept birth popul
women rural health age demograph
famili popul sexual rate growth
children urban abort mortal famili
marriag china social popul develop
child demograph women data polici
gender africa pregnanc women social
parent india behavior model econom
school mortal adolesc effect age
household migrat studi demograph human

Conditional topic distributions given word
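
A conditional topic distribution given a word follows from Bayes' rule: p(topic | word) is proportional to p(word | topic) · p(topic). A minimal sketch with hypothetical probabilities; the function name `topic_given_word` and the equal topic shares are assumptions for illustration.

```python
def topic_given_word(word, phi, topic_share):
    """Return p(topic | word) for every topic.

    phi[k][w] = p(w | topic k); topic_share[k] = p(topic k) in the corpus.
    """
    joint = {k: phi[k].get(word, 0.0) * topic_share[k] for k in phi}
    total = sum(joint.values())
    if total == 0:
        raise ValueError(f"{word!r} has zero probability under every topic")
    return {k: value / total for k, value in joint.items()}

# Hypothetical values: "prefer" is shared by two topics, "son" belongs to one.
phi = {
    "Topic24": {"prefer": 0.04, "son": 0.05},
    "Topic32": {"prefer": 0.06, "desir": 0.08},
}
topic_share = {"Topic24": 0.5, "Topic32": 0.5}

print(topic_given_word("prefer", phi, topic_share))
print(topic_given_word("son", phi, topic_share))
```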

Historical vs Modern

Modern vs Historical

Clusters

CEDAR topics

CEDAR papers in corpus
author year title
Junkka, Johan 2016 Gender and fertility within the free churches in the Sundsvall region, Sweden, 1860–1921.
Sandström, Glenn 2014 The mid-twentieth century baby boom in Sweden - changes in the educational gradient of fertility for women born 1915-1950.
Sandström, G; Vikström, L 2015 Sex preference for children in German villages during the fertility transition
Nordin, G; Sköld, P 2012 True or false? Nineteenth-century Sapmi fertility in qualitative vs. demographic sources
Broström, Göran 1985 Practical Aspects on the Estimation of the Parameters in Coale’s Model for Marital Fertility
Sören Edvinsson;Anders Brändström;John Rogers;Göran Broström 2005 High-Risk Families: The Unequal Distribution of Infant Mortality in Nineteenth-Century Sweden

CEDAR topic map

Largest topics

Conclusions

  • Fertility transition studies are dominated by empirical and descriptive topics
  • Historical demography is weak on perceptions, use of contraception and qualitative topics
  • Historical demography is strong on demographic, empirical and economic topics
  • CEDAR research is similar to the general historical demography topic distribution

Interactive exploration: topic-vis